CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model
Abstract
A corpus of written Italian – CORIS – has been under construction at the Centre for Theoretical and Applied Linguistics of Bologna University (CILTA) since 1998 and will soon be completed and made available on-line. The project aims at creating a representative and sizeable general reference corpus of contemporary Italian designed to be easily accessible and user-friendly. CORIS contains 80 million running words and will be updated every two years by means of a built-in monitor corpus. It consists of a collection of authentic texts in electronic form chosen by virtue of their representativeness of written Italian.
Similar resources
A dynamic model for reference corpora structure definition
A representative corpus of written Italian – CORIS – constructed at the Centre for Theoretical and Applied Linguistics of Bologna University (CILTA) is available on-line. Considering the importance of the comparability of reference corpora in interlinguistic studies, a further corpus – CODIS – was designed. Aimed at specialist needs, CODIS presents a dynamic and adaptive structure providing for...
The DiaCORIS project: a diachronic corpus of written Italian
The DiaCORIS project aims at the construction of a diachronic corpus comprising written Italian texts produced between 1861 and 1945, extending the structure and the research possibilities of the synchronic 100-million-word corpus CORIS/CODIS. A preliminary in-depth study has been performed in order to design a representative and well-balanced sample of the Italian language over a time period t...
Categorial Type Logics and Italian Corpora
In this abstract we will present work in progress on the annotation of Italian corpora carried out at the Interfaculty Center for Theoretical and Applied Linguistics (CILTA), University of Bologna. The project aims at tagging the 100-million-word synchronic corpus of contemporary Italian, CORIS/CODIS, with syntactic information. In particular, we will focus attention on our first task, namely t...
Lexical Bundles in English Abstracts of Research Articles Written by Iranian Scholars: Examples from Humanities
This paper investigates a special type of recurrent expressions, lexical bundles, defined as a sequence of three or more words that co-occur frequently in a particular register (Biber et al., 1999). Considering the importance of this group of multi-word sequences in academic prose, this study explores the forms and syntactic structures of three- and four-word bundles in English abstracts writte...
Translation Strategies in English to Persian Translation of Children's Literature based on Klingberg's Model
This research sought to identify the translation strategies adopted by the translator in the Persian translation of 'whatever after, Fairest of all' written by 'Sarah Mlynowski' based on Klingberg's model (1986). To achieve the objectives of the study, a qualitative content analysis design was selected. The corpus of the study consisted of 60 pages of the novel 'whatever after, Fairest of al...